89 research outputs found

    Isolating contour information from arbitrary images

    Get PDF
    Aspects of natural vision (physiological and perceptual) serve as a basis for attempting the development of a general processing scheme for contour extraction. Contour information is assumed to be central to visual recognition skills. While the scheme must be regarded as highly preliminary, initial results do compare favorably with the visual perception of structure. The scheme pays special attention to the construction of a smallest scale circular difference-of-Gaussian (DOG) convolution, calibration of multiscale edge detection thresholds with the visual perception of grayscale boundaries, and contour/texture discrimination methods derived from fundamental assumptions of connectivity and the characteristics of printed text. Contour information is required to fall between a minimum connectivity limit and maximum regional spatial density limit at each scale. Results support the idea that contour information, in images possessing good image quality, is (centered at about 10 cyc/deg and 30 cyc/deg). Further, lower spatial frequency channels appear to play a major role only in contour extraction from images with serious global image defects

    Spatial vision processes: From the optical image to the symbolic structures of contour information

    Get PDF
    The significance of machine and natural vision is discussed together with the need for a general approach to image acquisition and processing aimed at recognition. An exploratory scheme is proposed which encompasses the definition of spatial primitives, intrinsic image properties and sampling, 2-D edge detection at the smallest scale, the construction of spatial primitives from edges, and the isolation of contour information from textural information. Concepts drawn from or suggested by natural vision at both perceptual and physiological levels are relied upon heavily to guide the development of the overall scheme. The scheme is intended to provide a larger context in which to place the emerging technology of detector array focal-plane processors. The approach differs from many recent efforts in edge detection and image coding by emphasizing smallest scale edge detection as a foundation for multi-scale symbolic processing while diminishing somewhat the importance of image convolutions with multi-scale edge operators. Cursory treatments of information theory illustrate that the direct application of this theory to structural information in images could not be realized

    A discrepancy within primate spatial vision and its bearing on the definition of edge detection processes in machine vision

    Get PDF
    The visual perception of form information is considered to be based on the functioning of simple and complex neurons in the primate striate cortex. However, a review of the physiological data on these brain cells cannot be harmonized with either the perceptual spatial frequency performance of primates or the performance which is necessary for form perception in humans. This discrepancy together with recent interest in cortical-like and perceptual-like processing in image coding and machine vision prompted a series of image processing experiments intended to provide some definition of the selection of image operators. The experiments were aimed at determining operators which could be used to detect edges in a computational manner consistent with the visual perception of structure in images. Fundamental issues were the selection of size (peak spatial frequency) and circular versus oriented operators (or some combination). In a previous study, circular difference-of-Gaussian (DOG) operators, with peak spatial frequency responses at about 11 and 33 cyc/deg were found to capture the primary structural information in images. Here larger scale circular DOG operators were explored and led to severe loss of image structure and introduced spatial dislocations (due to blur) in structure which is not consistent with visual perception. Orientation sensitive operators (akin to one class of simple cortical neurons) introduced ambiguities of edge extent regardless of the scale of the operator. For machine vision schemes which are functionally similar to natural vision form perception, two circularly symmetric very high spatial frequency channels appear to be necessary and sufficient for a wide range of natural images. Such a machine vision scheme is most similar to the physiological performance of the primate lateral geniculate nucleus rather than the striate cortex

    Modes of Visual Recognition and Perceptually Relevant Sketch-based Coding for Images

    Get PDF
    A review of visual recognition studies is used to define two levels of information requirements. These two levels are related to two primary subdivisions of the spatial frequency domain of images and reflect two distinct different physical properties of arbitrary scenes. In particular, pathologies in recognition due to cerebral dysfunction point to a more complete split into two major types of processing: high spatial frequency edge based recognition vs. low spatial frequency lightness (and color) based recognition. The former is more central and general while the latter is more specific and is necessary for certain special tasks. The two modes of recognition can also be distinguished on the basis of physical scene properties: the highly localized edges associated with reflectance and sharp topographic transitions vs. smooth topographic undulation. The extreme case of heavily abstracted images is pursued to gain an understanding of the minimal information required to support both modes of recognition. Here the intention is to define the semantic core of transmission. This central core of processing can then be fleshed out with additional image information and coding and rendering techniques

    Method of improving a digital image

    Get PDF
    A method of improving a digital image is provided. The image is initially represented by digital data indexed to represent positions on a display. The digital data is indicative of an intensity value I.sub.i (x,y) for each position (x,y) in each i-th spectral band. The intensity value for each position in each i-th spectral band is adjusted to generate an adjusted intensity value for each position in each i-th spectral band in accordance with ##EQU1## where S is the number of unique spectral bands included in said digital data, W.sub.n is a weighting factor and * denotes the convolution operator. Each surround function F.sub.n (x,y) is uniquely scaled to improve an aspect of the digital image, e.g., dynamic range compression, color constancy, and lightness rendition. The adjusted intensity value for each position in each i-th spectral band is filtered with a common function and then presented to a display device. For color images, a novel color restoration step is added to give the image true-to-life color that closely matches human observation

    Scene Context Dependency of Pattern Constancy of Time Series Imagery

    Get PDF
    A fundamental element of future generic pattern recognition technology is the ability to extract similar patterns for the same scene despite wide ranging extraneous variables, including lighting, turbidity, sensor exposure variations, and signal noise. In the process of demonstrating pattern constancy of this kind for retinex/visual servo (RVS) image enhancement processing, we found that the pattern constancy performance depended somewhat on scene content. Most notably, the scene topography and, in particular, the scale and extent of the topography in an image, affects the pattern constancy the most. This paper will explore these effects in more depth and present experimental data from several time series tests. These results further quantify the impact of topography on pattern constancy. Despite this residual inconstancy, the results of overall pattern constancy testing support the idea that RVS image processing can be a universal front-end for generic visual pattern recognition. While the effects on pattern constancy were significant, the RVS processing still does achieve a high degree of pattern constancy over a wide spectrum of scene content diversity, and wide ranging extraneousness variations in lighting, turbidity, and sensor exposure

    Smart Image Enhancement Process

    Get PDF
    Contrast and lightness measures are used to first classify the image as being one of non-turbid and turbid. If turbid, the original image is enhanced to generate a first enhanced image. If non-turbid, the original image is classified in terms of a merged contrast/lightness score based on the contrast and lightness measures. The non-turbid image is enhanced to generate a second enhanced image when a poor contrast/lightness score is associated therewith. When the second enhanced image has a poor contrast/lightness score associated therewith, this image is enhanced to generate a third enhanced image. A sharpness measure is computed for one image that is selected from (i) the non-turbid image, (ii) the first enhanced image, (iii) the second enhanced image when a good contrast/lightness score is associated therewith, and (iv) the third enhanced image. If the selected image is not-sharp, it is sharpened to generate a sharpened image. The final image is selected from the selected image and the sharpened image

    The Spatial Vision Tree: A Generic Pattern Recognition Engine- Scientific Foundations, Design Principles, and Preliminary Tree Design

    Get PDF
    New foundational ideas are used to define a novel approach to generic visual pattern recognition. These ideas proceed from the starting point of the intrinsic equivalence of noise reduction and pattern recognition when noise reduction is taken to its theoretical limit of explicit matched filtering. This led us to think of the logical extension of sparse coding using basis function transforms for both de-noising and pattern recognition to the full pattern specificity of a lexicon of matched filter pattern templates. A key hypothesis is that such a lexicon can be constructed and is, in fact, a generic visual alphabet of spatial vision. Hence it provides a tractable solution for the design of a generic pattern recognition engine. Here we present the key scientific ideas, the basic design principles which emerge from these ideas, and a preliminary design of the Spatial Vision Tree (SVT). The latter is based upon a cryptographic approach whereby we measure a large aggregate estimate of the frequency of occurrence (FOO) for each pattern. These distributions are employed together with Hamming distance criteria to design a two-tier tree. Then using information theory, these same FOO distributions are used to define a precise method for pattern representation. Finally the experimental performance of the preliminary SVT on computer generated test images and complex natural images is assessed

    Processing Digital Imagery to Enhance Perceptions of Realism

    Get PDF
    Multi-scale retinex with color restoration (MSRCR) is a method of processing digital image data based on Edwin Land s retinex (retina + cortex) theory of human color vision. An outgrowth of basic scientific research and its application to NASA s remote-sensing mission, MSRCR is embodied in a general-purpose algorithm that greatly improves the perception of visual realism and the quantity and quality of perceived information in a digitized image. In addition, the MSRCR algorithm includes provisions for automatic corrections to accelerate and facilitate what could otherwise be a tedious image-editing process. The MSRCR algorithm has been, and is expected to continue to be, the basis for development of commercial image-enhancement software designed to extend and refine its capabilities for diverse applications

    Electro-Optical Design for Efficient Visual Communication

    Get PDF
    Visual communication, in the form of telephotography and television, for example, can be regarded as efficient only if the amount of information that it conveys about the scene to the observer approaches the maximum possible and the associated cost approaches the minimum possible. Elsewhere we have addressed the problem of assessing the end to end performance of visual communication systems in terms of their efficiency in this sense by integrating the critical limiting factors that constrain image gathering into classical communications theory. We use this approach to assess the electro-optical design of image gathering devices as a function of the f number and apodization of the objective lens and the aperture size and sampling geometry of the phot-detection mechanism. Results show that an image gathering device that is designed to optimize information capacity performs similarly to the human eye. For both, the performance approaches the maximum possible, in terms of the efficiency with which the acquired information can be transmitted as decorrelated data, and the fidelity, sharpness, and clearity with which fine detail can be restored
    corecore